BIG EARS VOICE INTERFACE

Connection Details for UK101 and OHIO SUPERBOARD


Introduction: Big Ears connects directly to the computer board via its standard 5 PIN DIN socket and the cable supplied.

[Diagram: 5 PIN DIN socket showing the 0V (screen), INH, +5, F0 and F1 pins, with cable colours white, yellow, blue and red]

Connections are made as follows:

FUNCTION    DIN CABLE      COMPUTER BOARD (SOLDER TO UNDERSIDE)
+5V         [illegible]    IC4 PIN 14 (or any +5V RAIL)
INH         [illegible]    IC2 PIN 15
0V          SCREEN         any 0V RAIL (or IC4 PIN 7)
F0          [illegible]    IC4 PIN 2
F1          [illegible]    IC4 PIN 12

Note: IC pins are numbered thus:

[Diagram: IC pin numbering, TOP VIEW and UNDERSIDE VIEW]

Please check connections very carefully!
Instructions for NASCOM Computers
© 1980 William Stuart Systems Ltd. 


4/ Load the MACHINE CODE & the BASIC program, and type RUN.
Software for UK101/Superboard


This is supplied as a Basic Program Listing and should be 
entered and SAVED in two stages: 


First enter and SAVE lines 1-19 (this is the part which 
sets up a short machine code routine ) - "Machine Code 
Loader". 


Secondly (after typing NEW) enter and SAVE the remainder. 
Check carefully for errors - "Analysis Program". 


Note that the Machine Code will be loaded into the top of
user memory, i.e. the 8th K. It is therefore essential to
have memory chips inserted into the last pair of sockets,
the minimum amount of memory being 5K in total, inserted
as 1, 2, 3, 4 and 8.

Loading Procedure (UK101/Superboard) 


1. RESET system

2. MEMORY SIZE ? 7679     [only if full 8K present - if 5K then type 4095]
   (carriage return to all other questions)

3. LOAD                   (load in Machine Code Loader)

4. RUN                    (if the word ERROR is printed, then a mistake
   OK                     exists in the load Data - check carefully)

5. NEW                    (deletes the Loader)

6. LOAD                   (load in the Analysis Program)

7. RUN

   Program types:  LEARN OR TEST?


3. Instructions for using "Big Ears" Speech Recognition System
(Demonstration Software)


1. Set up microphone on table about 1 foot from speaker's
mouth and positioned so that it is possible to speak 
directly into it without turning away from screen. 


2. Load Program as indicated in Section 2 and type RUN. 


3. The computer will ask:

   LEARN OR TEST?          (L selects "learn" mode)
   NEW WORD NUMBER?        (type a number between 1 & 6)
   TYPE IN WORD?           (type in word 1)

   PLEASE SAY APPLES
   NOW!                    (say the word loud and clear.
                           Note: one character in the
                           top row of the screen will
                           "spin" until sound has been
                           detected. This indicates that
                           Big Ears is waiting for you
                           and is a good sign that the
                           background noise is low
                           enough).
   OKAY
   [voiceprint array]      (Array of numbers shows how
                           the word has been stored -
                           this is the "voiceprint").

   PLEASE SAY APPLES
   NOW!                    (The word must be repeated
   etc.                    four times. The voiceprint
                           is printed each time).

   LEARN OR TEST?          (Carry on teaching words
   etc.                    2, 3, 4, 5, 6*).


4. After 2 or more words have been "taught", you can try
out the recognition software.

   LEARN OR TEST? T
   PLEASE SPEAK
   NOW!                    (say one of the words)
   OKAY
   [voiceprint array]      (voiceprint printed)
   APPLES       256.1
   PEARS        265.3      (correlation table printed)
   RASPBERRIES  270.3

   YOU SAID RASPBERRIES    (word with highest correlation
                           is indicated)


5. After you have experimented with the System, you can remove
the "voiceprint" printout by deleting line 1175.

The correlation table can be similarly suppressed by deleting
lines 2085, 2086 and 2087.


6. If more than 5K of memory is available, the number of words
in the vocabulary can be increased. Line 21 sets

   VL = Vocabulary Length, and
   LR = Number of Repetitions when learning.

(For optimum recognition, set LR = 8 and limit VL to around
10. Extension to a much larger vocabulary is discussed in
the Theory Section).


7. Remember that recognition depends on:

   clarity of speech

   similarity of words - very similar words will always
   be difficult to distinguish. In this respect the
   vowel content is the dominant feature, thus "pine" and
   "fine" might be difficult to separate.


8. Line 2090 prints the result of the recognition process,
"YOU SAID...".

Some entertaining effects can be had by changing this line to
remove the words YOU SAID and then, when teaching new words,
typing in not the spoken word but the desired "reply". Thus,
type in the phrase "I'M A COMPUTER" but repeat (teach) the
phrase "WHO ARE YOU". Remember that any word or phrase
spoken must last no longer than 1 second or it will be
incorrectly learned.


Warning: When using BIG EARS with the UK101 New Monitor,
the following changes are required:

Hardware Connections
F0 to IC5 pin 9
F1 to IC5 pin 5
INH to IC2 pin 16
+5 to IC4 pin 14 (or 5V rail)
0V to IC4 pin 7 (or 0V rail)

Software
4 DATA 173,0,223,133,34,69,35,42
5 DATA 144,3,254,128,30,42,144,3
6 DATA 254,192,30,42,144,3,234,234
13 DATA 232,224,64,208,175,96,10668,0
4002 POKE 57088,254
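
The software change can be cross-checked by hand, since the three altered bytes and the new checksum in line 13 should agree. The following is a small Python sketch, not part of the original software (the names STANDARD, NEW_MONITOR and checksum_delta are illustrative). It compares DATA lines 4 to 6 of the standard loader with the New Monitor versions. On the 6502, 106 is ROR A and 42 is ROL A, which suggests the New Monitor wiring presents the formant signals at the other end of the input byte, though that reading is an inference rather than something the leaflet states.

```python
# DATA lines 4-6 of the standard loader and of the New Monitor version.
STANDARD = {
    4: [173, 0, 223, 133, 34, 69, 35, 106],
    5: [144, 3, 254, 128, 30, 106, 144, 3],
    6: [254, 192, 30, 106, 144, 3, 234, 234],
}
NEW_MONITOR = {
    4: [173, 0, 223, 133, 34, 69, 35, 42],
    5: [144, 3, 254, 128, 30, 42, 144, 3],
    6: [254, 192, 30, 42, 144, 3, 234, 234],
}
STANDARD_CHECKSUM = 10860     # checksum stored in standard line 13
NEW_MONITOR_CHECKSUM = 10668  # checksum stored in New Monitor line 13


def checksum_delta(old, new):
    """Return how much the byte total changes when `old` lines become `new`."""
    return sum(sum(new[n]) - sum(old[n]) for n in old)


delta = checksum_delta(STANDARD, NEW_MONITOR)
print("checksum change:", delta)
print("consistent:", STANDARD_CHECKSUM + delta == NEW_MONITOR_CHECKSUM)
```

If the two checksums did not differ by exactly the change in the bytes, one of the DATA lines would have been mistyped.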


Theory of Operation 


Operation is based on frequency analysis of the first and
second formants of the speech waveform. The Interface unit
separates the formants and delivers digital pulses to the
computer, which counts the changes of state of each formant
in each of 64 sampling periods of 16 ms. This is performed
by machine code.
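
The counting step can be pictured with a short Python sketch. It is not the machine code itself, which polls the interface in real time; here the formant signal is assumed to be available as a list of 0/1 samples, and the function name count_transitions is hypothetical.

```python
def count_transitions(samples, periods=64):
    """Split a 0/1 sample stream into equal windows and count the
    changes of state that fall in each window."""
    per_window = len(samples) // periods
    counts = []
    previous = samples[0]
    for p in range(periods):
        changes = 0
        for s in samples[p * per_window:(p + 1) * per_window]:
            if s != previous:
                changes += 1
            previous = s
        counts.append(changes)
    return counts


# A toy signal: eight rapidly alternating samples followed by eight quiet ones,
# analysed as two windows instead of the 64 used by Big Ears.
demo = [0, 1] * 4 + [0] * 8
print(count_transitions(demo, periods=2))
```

A higher count in a window corresponds to a higher formant frequency during that period, which is why the counts can later be sorted into frequency ranges.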


For each period, the two formant counts are then compared
against threshold data values to determine which of
5 (formant 2) or 6 (formant 1) frequency ranges are present.
The two range indices are now used to determine the
location in a two-dimensional (5 x 6) array which will be
incremented. This is, therefore, a kind of "frequency-space"
and the 64 samples must all fit into it as a
2-dimensional histogram.


When "learning" a word, four or more such histograms are
averaged, then normalised to have a mean value of zero and a
uniform standard deviation. The resulting "voiceprint" is
then stored for future correlation.
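
The normalisation can be sketched in Python as follows. This mirrors lines 3080-3096 of the listing, with two differences that are assumptions of the sketch: histograms are passed in as flat 30-element lists, and the mean is computed from the data, whereas the listing subtracts the constant 2.133 (that is, 64/30). The scale factor 8 is the one used in line 3094.

```python
import math


def make_voiceprint(histograms):
    """Average several flat 30-element histograms, remove the mean and
    scale so that the squares of the elements sum to 64 (8 squared)."""
    repeats = len(histograms)
    size = len(histograms[0])
    averaged = [sum(h[i] for h in histograms) / repeats for i in range(size)]
    mean = sum(averaged) / size
    centred = [a - mean for a in averaged]
    norm = math.sqrt(sum(c * c for c in centred))
    return [8 * c / norm for c in centred]


# Four identical repetitions in which every period fell in location 1.
vp = make_voiceprint([[64] + [0] * 29] * 4)
print(round(sum(vp), 6), round(sum(v * v for v in vp), 6))
```

Because every stored voiceprint ends up with the same overall size, the correlation scores of different words can be compared directly.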


Software Details 


The software is written mainly in subroutines for ease of 
incorporation into your own applications. 


Line Number    Function

4000-4060      "Listen" subroutine - called by GOSUB 4000.
               This sets up the call to the machine code
               (USR) subroutine, enables the hardware
               interface unit (UK101/Superboard only),
               clears the input buffers and executes the
               machine code for real-time voice acquisition.
               The messages "NOW" and "OKAY" are printed
               before and after acquisition respectively.

1000-1180      "Classify" subroutine - the two input
               buffers are processed to produce the 30
               element histogram P(30).
               Line 1175 calls an optional printout of the
               histogram.

2000-2095      "Correlate" subroutine - the input histogram
               P(30) is multiplied element by element with
               each stored Voiceprint and summed to produce
               correlation results CC(VL). The results
               are then searched to select the highest value
               and that word is printed. Lines 2085-2087
               give an (optional) printout of all the
               correlation results. The routine returns
               with BW set to the word number recognised.
               This can, of course, be used to take action
               dependent on the application.

3000-3098      "Learn" subroutine - invites the user to
               create or update his vocabulary.
               Words must be repeated LR times (LR = 4 to 8)
               in order to give a statistically good
               Voiceprint. Voiceprints are stored for
               each of VL (Vocabulary Length) words, in
               the two-dimensional 30 x VL array
               VP(VL,30).
               The text strings for each word are stored
               as array VW$(VL).
               The routine returns with BW = word number.

5000-5050      "Pattern Print" subroutine - can be used to
               print out the P(30) array, which contains
               the most recently spoken voice pattern.
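
For readers adapting the "Correlate" subroutine to another language, the following Python sketch shows the same idea as lines 2000-2095. It is an illustration only: the function name correlate and the list-of-lists layout are assumptions, and the tiny three-element vectors stand in for real 30-element voiceprints.

```python
def correlate(pattern, voiceprints):
    """Multiply the pattern element by element with each voiceprint,
    sum the products, and return (best word number, all scores).

    Word numbers start at 1, as BW does in the BASIC listing. On a tie
    the later word wins, because the listing only skips a word whose
    score is strictly lower than the best so far.
    """
    scores = [sum(p * v for p, v in zip(pattern, vp)) for vp in voiceprints]
    best = 0
    for wd in range(1, len(scores)):
        if scores[wd] >= scores[best]:
            best = wd
    return best + 1, scores


bw, cc = correlate([0, 3, 1], [[1, -1, 0], [0, 1, -1], [-1, 0, 1]])
print(bw, cc)
```

The returned word number plays the role of BW, so a calling program can branch on it instead of printing the word.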


Extending to Large Vocabularies 


The key to the successful implementation of large vocabularies 
lies in structuring the application so that the expected 
response is always one of a reasonably small set of words, 
with the initial set of words consisting of key words which 
lead to the next group. 


Thus, a Travel reservation system might initially ask
"Inland, European or Intercontinental?", to which each of
the three replies will lead to a list of, say, 8 or 10
possible destinations. If the destination lists need to be
extended, then the word "other" could be included in each
one and the program organised to call in the subsequent list.


Program implementation is best achieved as follows. Set
VL (Vocabulary Length) in line 21 to the total number of words
to be stored. Define a control array of (say) 10 elements
by adding the line 20 DIM CA(10).


Then change the following lines: 


2010 FOR Q = 1 TO 10 : WD = CA(Q)
2040 NEXT Q
2055 BW = CA(1) : BC = CC(CA(1))
2060 FOR Q = 2 TO 10 : WD = CA(Q)
2080 NEXT Q
2085, 2086, 2087: omit


The correlation routine will now attempt to match only 
those 10 words whose word numbers are held in array CA (10). 
The master program (lines 110 to 150 in the demonstration 
software) must now be modified to set up CA (1) to CA (10) 
with the "expected" word numbers before asking questions. 




A useful hint is to leave certain "master" words permanently
in the control array - e.g. "RESET" as word 1 could be
used to revert to an initial dialogue no matter where the
conversation had reached, and "RUBOUT" as word 2 could be
used to allow the speaker another attempt if he sees that
his word has been incorrectly recognised. As before, the
result of GOSUB 2000 (correlate) is always a printout of
the word and BW is set to the word number.
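
The control-array scheme, including the permanent "master" words, might be sketched in Python like this. All names are hypothetical, and the padding rule (repeating the last entry so that every one of the 10 slots holds a valid word number) is an assumption of the sketch; the leaflet does not say how unused slots should be filled.

```python
MASTER_WORDS = [1, 2]   # e.g. "RESET" and "RUBOUT", always active


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def control_array(expected, size=10):
    """Build CA: master words first, then the expected words, padded."""
    ca = MASTER_WORDS + [w for w in expected if w not in MASTER_WORDS]
    if len(ca) > size:
        raise ValueError("too many active words for the control array")
    return ca + [ca[-1]] * (size - len(ca))


def correlate_active(pattern, voiceprints, ca):
    """Pick the best-matching word, looking only at word numbers in CA."""
    best = ca[0]
    best_score = dot(pattern, voiceprints[best])
    for wd in ca[1:]:
        score = dot(pattern, voiceprints[wd])
        if score >= best_score:
            best, best_score = wd, score
    return best


# Word 4 would match best, but only words 1, 2 and 3 are expected here.
prints = {1: [1, 0, 0], 2: [0, 1, 0], 3: [0, 0, 1], 4: [0, 0, 5]}
ca = control_array([3])
print(ca, correlate_active([0, 0, 1], prints, ca))
```

Restricting the search in this way keeps each recognition step small even when the total number of stored voiceprints is large.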


BASIC SOFTWARE for Speech Recognition 


MACHINE CODE LOADER (SPEECH INPUT) (C) 1980 Wm Stuart Systems Ltd.
FOR UK101/SUPERBOARD


1 REM SPEECH LOADER (C) 1980 WM STUART SYSTEMS
2 XX=7680
3 DATA 162,  0,134, 33,169,160,133, 32
4 DATA 173,  0,223,133, 34, 69, 35,106
5 DATA 144,  3,254,128, 30,106,144,  3
6 DATA 254,192, 30,106,144,  3,234,234
7 DATA 234,160, 16,136,208,253,165, 34
8 DATA 133, 35,198, 32,208,218,165, 33
9 DATA 208, 30,221,128, 30, 24,125,192
10 DATA 30,238, 32,208, 74,201,  8, 16
11 DATA 13,169,  0,157,128, 30,157,192
12 DATA 30,234,234,234,240,182,230, 33
13 DATA 232,224, 64,208,175, 96,10860, 0
14 CS=0
15 FOR N=0 TO 85
16 READ DD: POKE XX+N,DD: CS=CS+DD
17 NEXT
18 READ DD
19 IF CS<>DD THEN PRINT"ERROR"
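
Before keying in the loader it can be reassuring to confirm that the DATA values add up to the stored checksum. The Python sketch below does what lines 14-19 do, but off the machine; the names LOADER_DATA and verify are illustrative. The final value of line 7 is taken as 34, which is an assumption supported by the fact that it makes the total balance.

```python
LOADER_DATA = [
    162, 0, 134, 33, 169, 160, 133, 32,      # line 3
    173, 0, 223, 133, 34, 69, 35, 106,       # line 4
    144, 3, 254, 128, 30, 106, 144, 3,       # line 5
    254, 192, 30, 106, 144, 3, 234, 234,     # line 6
    234, 160, 16, 136, 208, 253, 165, 34,    # line 7
    133, 35, 198, 32, 208, 218, 165, 33,     # line 8
    208, 30, 221, 128, 30, 24, 125, 192,     # line 9
    30, 238, 32, 208, 74, 201, 8, 16,        # line 10
    13, 169, 0, 157, 128, 30, 157, 192,      # line 11
    30, 234, 234, 234, 240, 182, 230, 33,    # line 12
    232, 224, 64, 208, 175, 96,              # line 13 (code bytes only)
]
EXPECTED_CHECKSUM = 10860                    # seventh value in line 13


def verify(data, expected):
    """Sum every byte and compare with the stored total (lines 14-19)."""
    return sum(data) == expected


print(len(LOADER_DATA), "bytes, checksum ok:",
      verify(LOADER_DATA, EXPECTED_CHECKSUM))
```

A single mistyped value changes the total, which is exactly the condition under which the BASIC loader prints ERROR.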


20 REM SPEECH RECOGNITION (C) WM. STUART SYSTEMS 1980
21 VL = 6 : LR = 4
22 XX = 7680 : REM USER MEMORY
24 B0 = XX + 128 : B1 = B0 + 64
25 DIM P(30), CC(VL), VP(VL,30), VW$(VL), PN(30)
30 FOR N = 1 TO 5
32 READ B0(N), B1(N)
35 NEXT
38 DATA 6, 32, 13, 48, 19, 64, 25, 80, 32, 100
40 POKE 530,1 : REM DISABLE CTRL/C UK101/SUPERBD
100 INPUT "LEARN OR TEST"; A$
105 IF A$ = "L" THEN GOTO 200
110 PRINT "PLEASE SPEAK"
120 GOSUB 4000 : REM LISTEN
140 GOSUB 2000 : REM CORRELATE & PRINT
150 GOTO 100
200 GOSUB 3000 : REM GENERATE
210 GOTO 100

1000 REM CLASSIFY INTO FREQ SPACE
1010 FOR EL = 1 TO 30 : P(EL) = 0 : NEXT
1020 FOR K = 0 TO 63
1030 F0 = 1
1050 IF B0(F0) > PEEK(B0 + K) THEN 1100
1060 F0 = F0 + 1
1070 IF F0 < 6 THEN 1050
1100 F1 = 1
1120 IF B1(F1) > PEEK(B1 + K) THEN 1150
1130 F1 = F1 + 1
1140 IF F1 < 5 THEN 1120
1150 EL = (F1-1) * 6 + F0
1160 P(EL) = P(EL) + 1
1170 NEXT K
1175 GOSUB 5000 : REM (OPTIONAL) PRINT OF FREQ SPACE
1180 RETURN


2000 REM CORRELATE & IDENTIFY
2010 FOR WD = 1 TO VL
2015 CC(WD) = 0
2020 FOR EL = 1 TO 30
2030 CC(WD) = CC(WD) + P(EL) * VP(WD,EL)
2035 NEXT EL
2040 NEXT WD
2050 REM NOW FIND BEST
2055 BW = 1 : BC = CC(1)
2060 FOR WD = 2 TO VL
2070 IF CC(WD) < BC THEN GOTO 2080
2075 BW = WD : BC = CC(WD)
2080 NEXT WD
2085 FOR WD = 1 TO VL : REM OPTIONAL: PRINTS ALL WORD SCORES
2086 PRINT VW$(WD), CC(WD)
2087 NEXT
2090 PRINT : PRINT "YOU SAID"; VW$(BW)
2095 RETURN

3000 REM VOICEPRINT GEN
3005 INPUT "NEW WORD NUMBER"; WD
3010 INPUT "TYPE IN WORD"; VW$(WD)
3020 FOR EL = 1 TO 30
3025 PN(EL) = 0
3030 NEXT EL
3040 FOR N = 1 TO LR
3045 PRINT "PLEASE SAY"; VW$(WD)
3050 GOSUB 4000 : PRINT "THANK YOU"
3060 FOR EL = 1 TO 30
3065 PN(EL) = PN(EL) + P(EL)
3070 NEXT EL
3075 NEXT N
3080 S = 0
3082 FOR EL = 1 TO 30
3084 VP(WD,EL) = PN(EL)/LR - 2.133
3086 S = S + VP(WD,EL)^2
3088 NEXT EL
3090 S = SQR(S)
3092 FOR EL = 1 TO 30
3094 VP(WD,EL) = 8 * VP(WD,EL)/S
3096 NEXT EL
3098 RETURN


4000 REM LISTEN
4002 POKE 57088,253 : REM AUDIO ON (UK101 & SUPERBD ONLY)
4005 POKE 11,0 : REM UK101/SUPERBOARD USR ADDRESS
4008 POKE 12,30
4010 FOR X = 0 TO 63
4020 POKE B0 + X, 0
4025 POKE B1 + X, 0
4030 NEXT X
4035 PRINT "NOW!"
4040 X = USR(X)
4041 POKE 57088,255 : REM AUDIO OFF (UK101 & SUPERBD)
4045 PRINT "OKAY"
4050 GOSUB 1000 : REM CLASSIFY
4060 RETURN

5000 FOR R = 1 TO 5 : REM PATTERN PRINT
5005 FOR C = 1 TO 6
5010 PRINT P(C + (R-1) * 6);
5020 NEXT C
5030 PRINT
5040 NEXT R
5050 RETURN
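
The addresses used in the listing all follow from the single value XX = 7680. The short Python sketch below works through the arithmetic; it is a worked example rather than part of the software, and the variable names are chosen for this illustration.

```python
XX = 7680                  # line 22: start of the machine code
CODE_LENGTH = 86           # bytes POKEd by the loader (N = 0 TO 85)

# The USR address is given as low byte, then high byte (POKE 11 and POKE 12).
usr_low, usr_high = XX % 256, XX // 256

B0 = XX + 128              # line 24: first input buffer, 64 bytes
B1 = B0 + 64               # line 24: second input buffer, 64 bytes
TOP_OF_8K = 8 * 1024

print("MEMORY SIZE answer:", XX - 1)
print("USR address bytes:", usr_low, usr_high)
print("code ends before buffers:", XX + CODE_LENGTH <= B0)
print("buffers fit below 8K:", B1 + 64 <= TOP_OF_8K)
```

This is why MEMORY SIZE is answered with 7679 on an 8K machine: BASIC is kept below the machine code and its two buffers, all of which live in the 8th K.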


Routines for Saving & Loading Voiceprints on UK101 


7000 REM SAVE VOICEPRINTS
7010 PRINT "TAPE ON, THEN TYPE X+RETURN"
7020 INPUT A$
7030 SAVE
7040 FOR WD = 1 TO VL
7042 A$ = VW$(WD)

7044 IF A$ =o THEN A$ m ЫҚ " 

7050 PRINT A$
7060 FOR EL = 1 TO 30
7070 PRINT VP(WD,EL)
7080 NEXT
7090 NEXT
PRINT "DONE"
POKE 517,0     (restores full speed print, i.e. clears the flag)
GOTO 100


REM LOAD VOICEPRINTS
INPUT "TAPE ON PLAY, THEN TYPE X+RETURN";A$
LOAD
FOR WD = 1 TO VL
INPUT VW$(WD)
FOR EL = 1 TO 30
INPUT VP(WD,EL)
NEXT
NEXT
GOTO 100


BIG EARS - - USER NEWS 


Software 


Try this modification, which has the effect of reducing
the weighting given to the first Voiceprint element.
This is the 'all low' count, and corresponds to the
silent part of the listening period. By reducing its
weight, the non-silent parts are correspondingly accentuated,
and the system should be less sensitive to the
duration of the words.


3076 X=PN(1) : PN(1)=PN(1)/4
3077 NS=LR*64-X+PN(1) : AV=(NS/LR)/30
3084 VP(WD,EL)=PN(EL)/LR-AV

2005 P(1)=P(1)/4


Note: For good results set LR=8, i.e.
21 VL=11 : LR=8 (11 word vocabulary)
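
The effect of the modification on the learning step can be followed in this Python sketch of lines 3076, 3077 and 3084. It is an illustration under stated assumptions: the summed histogram PN is passed as a flat list whose first element is the 'all low' count, and the function name reweight is hypothetical.

```python
def reweight(pn, lr):
    """Quarter the 'all low' count, then subtract the new average so that
    the centred voiceprint still has a mean of zero.

    pn: counts summed over lr repetitions (lr * 64 counts in total).
    Returns (centred values, average that was subtracted).
    """
    pn = list(pn)
    x = pn[0]
    pn[0] = x / 4                    # line 3076
    ns = lr * 64 - x + pn[0]         # line 3077: new total count
    av = (ns / lr) / 30              # line 3077: new average per element
    centred = [v / lr - av for v in pn]   # line 3084
    return centred, av


# Four repetitions in which 140 of the 256 counts fell in the silent location.
centred, av = reweight([140] + [4] * 29, lr=4)
print(round(av, 4), round(sum(centred), 6))
```

Line 2005 applies the same division by four to P(1) before correlation, so the pattern being tested and the stored voiceprints are weighted alike.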


Hardware 


The sensitivity of Big Ears is adjustable. To change it,
remove the cabinet's lid and turn the small preset
potentiometer with a small screwdriver: clockwise to increase,
anti-clockwise to decrease. Beware of excess sensitivity.
If no word is spoken, the software should listen indefinitely,
and if triggered by a very short sound the Voiceprint
should show virtually all 64 counts in the top left-hand
location. If this is not the case then the sensitivity is too
great for the background noise level.


(c) 1981 William Stuart Systems


